bigmemory (version 3.12)

biglm.big.matrix, bigglm.big.matrix: Use Thomas Lumley's "biglm" package with a "big.matrix"

Description

This is a wrapper to Thomas Lumley's biglm package, allowing its use with data stored in big.matrix objects.

Usage

biglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL, 
  getNextChunkFunc=NULL)
bigglm.big.matrix(formula, data, chunksize=NULL, ..., fc=NULL,
  getNextChunkFunc=NULL)

Arguments

formula
a model formula.
data
a big.matrix or data.frame object.
chunksize
an integer giving the maximum size of each chunk of data to process iteratively; if omitted, a suitable default is supplied (see Details).
fc
the names of the variables to be treated as factors.
getNextChunkFunc
a function that generates the set of indices for the next chunk; if omitted, a suitable default is supplied.
...
any other arguments supported by biglm and bigglm.
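
As a rough illustration of what a chunk index generator does, the closure below hands out successive blocks of row indices and returns NULL once the rows are exhausted. The name make_chunk_indexer and its reset argument are hypothetical, and the exact signature that getNextChunkFunc must satisfy is not documented on this page, so treat this as a sketch of the idea rather than a drop-in argument.

```r
# Hypothetical helper (not part of bigmemory): returns a function that
# yields row indices one chunk at a time.
make_chunk_indexer <- function(n_rows, chunksize) {
  start <- 1L
  function(reset = FALSE) {
    if (reset) {
      start <<- 1L                      # rewind to the first row
      return(invisible(NULL))
    }
    if (start > n_rows) return(NULL)    # no rows left
    last <- min(start + chunksize - 1L, n_rows)
    idx <- seq.int(start, last)
    start <<- last + 1L                 # remember where the next chunk begins
    idx
  }
}

next_chunk <- make_chunk_indexer(n_rows = 150L, chunksize = 60L)
first  <- next_chunk()   # rows 1 to 60
second <- next_chunk()   # rows 61 to 120
third  <- next_chunk()   # rows 121 to 150 (a short final chunk)
done   <- next_chunk()   # NULL: all rows consumed
```

Keeping the position in a closure means the caller only ever asks for "the next chunk" and never has to track offsets itself.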

Value

  • an object of class biglm.

Details

See biglm package for more information; chunksize defaults to floor(nrow(data)/ncol(data)^2).
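
The default chunk size is simple arithmetic on the dimensions of the data. The short sketch below evaluates it for the iris data used in the Examples; default_chunksize is a hypothetical helper written only for illustration, not a function exported by bigmemory.

```r
# Hypothetical helper mirroring the documented default:
# floor(nrow(data) / ncol(data)^2).
default_chunksize <- function(n_rows, n_cols) {
  floor(n_rows / n_cols^2)
}

# The iris data has 150 rows and 5 columns.
iris_chunk <- default_chunksize(nrow(iris), ncol(iris))
print(iris_chunk)  # 150 / 25 = 6

# For a wide matrix the same formula can evaluate to 0.
wide_chunk <- default_chunksize(100, 20)  # floor(100 / 400) = 0
```

Because the formula can reach 0 when there are few rows relative to columns, passing chunksize explicitly may be the safer choice for such data.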

These functions will be removed from bigmemory and located in a new package, biganalytics or bigmemoryanalytics, in the Fall of 2009.

References

Algorithm AS274. Applied Statistics (1992), Vol. 41, No. 2.

Thomas Lumley (2005). biglm: bounded memory linear and generalized linear models. R package version 0.7.

See Also

big.matrix

Examples

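# A deliberately simple example using the iris data. It shows that the
# wrapper to Lumley's biglm() function produces the same answer as the
# plain old lm() function.
# Assumes bigmemory 3.12 (which provides biglm.big.matrix) and an
# installed biglm package.
library(bigmemory)

# unlist() turns the Species factor into its integer codes, so the
# resulting matrix is entirely numeric.
x <- matrix(unlist(iris), ncol=5)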
colnames(x) <- names(iris)
x <- as.big.matrix(x)
head(x)

silly.biglm <- biglm.big.matrix(Sepal.Length ~ Sepal.Width + Species, data=x, fc="Species")
summary(silly.biglm)

y <- data.frame(x[,])
y$Species <- as.factor(y$Species)
head(y)

silly.lm <- lm(Sepal.Length ~ Sepal.Width + Species, data=y)
summary(silly.lm)
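
To see why a fit computed chunk by chunk can agree with lm(), the base R sketch below accumulates the cross-products X'X and X'y over blocks of rows and then solves the normal equations. This is a swapped-in, simpler technique chosen for illustration: biglm itself is based on the AS274 algorithm cited in the References, not on this accumulation. The chunk size of 50 is an arbitrary assumption.

```r
# Design matrix with Species expanded to dummy columns, as lm() would do.
X <- model.matrix(Sepal.Length ~ Sepal.Width + Species, data = iris)
y <- iris$Sepal.Length

chunksize <- 50L                                # arbitrary, for illustration
starts <- seq(1L, nrow(X), by = chunksize)      # first row of each chunk

XtX <- matrix(0, ncol(X), ncol(X))              # running sum of X'X
Xty <- numeric(ncol(X))                         # running sum of X'y
for (s in starts) {
  rows <- s:min(s + chunksize - 1L, nrow(X))
  Xc <- X[rows, , drop = FALSE]                 # only this chunk is touched
  XtX <- XtX + crossprod(Xc)
  Xty <- Xty + crossprod(Xc, y[rows])
}

# Solve the accumulated normal equations and compare with lm().
beta_chunked <- drop(solve(XtX, Xty))
beta_lm <- coef(lm(Sepal.Length ~ Sepal.Width + Species, data = iris))
same_fit <- isTRUE(all.equal(unname(beta_chunked), unname(beta_lm)))
```

Only one chunk of rows is needed at a time, while the accumulated quantities stay at a fixed size determined by the number of model terms.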